The Diploid Genome Sequence of an Individual Human

Authors

  • Samuel Levy
  • Granger Sutton
  • Pauline C Ng
  • Lars Feuk
  • Aaron L Halpern
  • Brian P Walenz
  • Nelson Axelrod
  • Jiaqi Huang
  • Ewen F Kirkness
  • Gennady Denisov
  • Yuan Lin
  • Jeffrey R MacDonald
  • Andy Wing Chun Pang
  • Mary Shago
  • Timothy B Stockwell
  • Alexia Tsiamouri
  • Vineet Bafna
  • Vikas Bansal
  • Saul A Kravitz
  • Dana A Busam
  • Karen Y Beeson
  • Tina C McIntosh
  • Karin A Remington
  • Josep F Abril
  • John Gill
  • Jon Borman
  • Yu-Hui Rogers
  • Marvin E Frazier
  • Stephen W Scherer
  • Robert L Strausberg
  • J. Craig Venter
Abstract

Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, yet it involves 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
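The 22% and 74% figures can be checked against the counts given in the abstract. The following is a minimal sketch, not the authors' calculation: it assumes each SNP contributes exactly one variant base, treats the rounded 12.3 Mb total as exact, and leaves out segmental duplications and copy number variation regions because the abstract gives no counts for them. The variable names are illustrative.

```python
# Variant event counts as reported in the abstract.
counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}

# "encompassing 12.3 Mb" (a rounded figure, treated as exact here).
total_variant_bases = 12.3e6

total_events = sum(counts.values())
non_snp_events = total_events - counts["snp"]

# Share of all variant events that are not SNPs.
non_snp_event_share = non_snp_events / total_events

# Assumption: one variant base per SNP, so the remaining bases are non-SNP.
non_snp_base_share = 1 - counts["snp"] / total_variant_bases

print(f"total events:        {total_events:,}")
print(f"non-SNP event share: {non_snp_event_share:.1%}")
print(f"non-SNP base share:  {non_snp_base_share:.1%}")
```

Under these assumptions the enumerated events sum to 4,118,889, consistent with "more than 4.1 million", and both shares round to the percentages quoted in the abstract.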

Journal:
  • PLoS Biology

Volume 5

Publication date: 2007